1 Abstract

I review evidence for the claim that syntactic ambiguities 
are resolved on the basis of the meaning of the compet- 
ing analyses, not their structure. I identify a collection 
of ambiguities that do not yet have a meaning-based ac- 
count and propose one which is based on the interaction 
of discourse and grammatical function. I provide evi- 
dence for my proposal by examining statistical properties 
of the Penn Treebank of syntactically annotated text. 



2 Introduction 

On what basis do people resolve the syntactic ambiguity
so common in language? Some researchers (e.g. Frazier
and Fodor 1978, Mitchell, Corley and Garnham 1992 and
references therein) have argued that the sentence processor
embodies structurally defined criteria such as Minimal
Attachment and Late Closure. Others (e.g. Crain and
Steedman 1985, Altmann and Steedman 1988, Trueswell
and Tanenhaus 1991) have argued that ambiguity resolution
decisions are made online, rather quickly, based on
the relative sensibleness of the available analyses.
Psycholinguistic research on this issue has focused on a
limited collection of structures such as:

(1) a. The horse raced past the barn fell. 

b. The doctor told the patient that he was having 
trouble with to leave. 

Out of context, sentences with structures similar to either
of these examples are garden-paths. When such sentences
are put in contexts that support the correct reading,
the garden-path effect disappears. Some questions
remain about how quickly such discourse-sensitive
sensibleness preferences are brought to bear on the ambiguity
resolution process (Mitchell et al. 1992). But this issue
will not be addressed further here.

*This paper appears in the Proceedings of the Fifteenth Annual
Meeting of the Cognitive Science Society (1993). It is electronically
archived in the Computation and Language E-Print Archive as
cmp-lg/9406028. I wish to thank Bob Frank, Susan Garnsey, Young-Suk
Lee and Mark Steedman for their helpful suggestions. Any remaining
errors are my own. This research was supported by the following
grants: DARPA N00014-90-J-1863, ARO DAAL03-89-C-0031, NSF
IRI 90-16592, Ben Franklin 91S.3078C-1.

One attractive aspect of structural preference theories
is that two very simple structural strategies can account
for so much data. Aside from the two examples in (1),
Minimal Attachment, defined in (2), predicts that (3) is a
garden-path.

(2) Minimal Attachment (Frazier and Fodor 1978): 
Each lexical item (or other node) is to be at- 
tached into the phrase marker [in the way which 
requires the smallest] possible number of nonter- 
minal nodes linking it with the nodes which are 
already present. 

(3) John has heard the joke about the pygmy is of- 
fensive. 
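
The node-counting comparison behind (2) can be illustrated with a small Python sketch. It is not part of the original proposal: the tree encoding, the function names, and the decision to count every labelled node (including part-of-speech nodes) are assumptions made for illustration. Because the two candidate analyses of 'heard the joke' differ only in the extra S node, comparing total node counts gives the same ranking as counting only the nodes that link the new material to the existing phrase marker.

```python
# Illustrative sketch only: trees are nested tuples (label, child, ...),
# and words are plain strings. These conventions are assumptions.

def count_nonterminals(tree):
    """Count labelled nodes in a tree; words (strings) count as zero."""
    if isinstance(tree, str):
        return 0
    _label, *children = tree
    return 1 + sum(count_nonterminals(child) for child in children)


def minimal_attachment(analyses):
    """Return the name of the analysis with the fewest labelled nodes."""
    return min(analyses, key=lambda name: count_nonterminals(analyses[name]))


the_joke = ("NP", ("Det", "the"), ("N", "joke"))

# Two ways of attaching 'the joke' after 'heard' in example (3).
analyses = {
    # 'the joke' is the direct object of 'heard'.
    "NP-complement": ("VP", ("V", "heard"), the_joke),
    # 'the joke' is the subject of a complement clause still to come.
    "S-complement": ("VP", ("V", "heard"), ("S", the_joke)),
}

preferred = minimal_attachment(analyses)
print(preferred)
```

Under these assumptions the NP-complement analysis has one node fewer, which is why (3) is predicted to be a garden path once 'is offensive' arrives.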

Late Closure (4) predicts difficulties with the sentences
in (5).



(4) Late Closure (Frazier and Rayner 1982): When
possible, attach incoming lexical items into the 
clause or phrase currently being processed (i.e. 
the lowest possible nonterminal node dominating 
the last item analyzed). 

(5) a. John said that Bill will leave yesterday. 

b. When the cannibals ate the missionaries 
drank. 

c. Without her contributions failed to come in. 
(from Pritchett 1988) 

d. When they were on the verge of winning the
war against Churchill and Roosevelt met in
Yalta to divide up postwar Europe. (from
Ladd 1992)



The claim of this paper is that ambiguity resolution decisions
are based solely on the sensibleness of the available
readings. I argue that for each structure in (3)
and (5), aspects of meaning which must be assumed for
independent reasons are responsible for the ambiguity
resolution behavior observed in humans.



3 Existing Accounts 

Crain and Steedman (1985) argued that resolution of
the ambiguity in (1)b is sensitive to the prior discourse.
When there is a unique patient in the discourse, the NP
'the patient' uniquely identifies him/her and there is no
need for further restrictive modifiers. Consistent with
Grice's (1965) maxim of Manner (be brief), the hearer
selects the complement clause analysis of 'that he...'.
When there are two patients in the discourse, the NP 'the
patient' does not pick out a unique referent, so Grice's
maxim of Quantity (make your contribution as informative
as required) guides the hearer to construe the continuation
'that he...' as a restrictive relative clause.

When the sentence is presented out of context, the 
hearer must accommodate a situation in which it is fe- 
licitous. That is, the hearer must change his/her mental 
model to support the presuppositions carried by the sen- 
tence. The restrictive relative clause reading requires ac- 
commodating a more complex situation in which there is 
more than one patient, so according to a preference for 
parsimony, the complement clause analysis is preferred. 

In another study Crain and Steedman (1985) found that
the garden path effect in sentences with structures as in
(1)a is reduced/eliminated when the main-verb reading
is implausible. This finding was replicated by Pearlmutter
and MacDonald (1992). Others (e.g. Trueswell and
Tanenhaus 1991) have found that other aspects of meaning
(e.g. temporal coherence) are also relevant for the
ambiguity in (1)a.

Niv (1992) argued that the tendency for low attachment 
of adverbials such as 'yesterday', as in (5)a, results from
a general principle of competence to order constituents 
in the sentence in increasing order of "information vol- 
ume". Attaching 'yesterday' high is dispreferred as this 
single-word constituent would have to follow the heavier 
constituent 'that Bill will leave'. This preference disap- 
pears when the adverbial is made heavier, e.g. 'because 
he became very angry at us.' 
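
The ordering principle can be made concrete with a rough sketch. Word count is used here as a stand-in for "information volume"; that proxy and the function name are assumptions for illustration and are not Niv's (1992) definition.

```python
def respects_increasing_volume(constituents):
    """True if each constituent is at least as heavy as the one before it.

    Weight is approximated by word count (an assumption of this sketch).
    """
    weights = [len(c.split()) for c in constituents]
    return all(a <= b for a, b in zip(weights, weights[1:]))


# High attachment makes the adverbial a sister that follows the complement clause.
light = ["that Bill will leave", "yesterday"]
heavy = ["that Bill will leave", "because he became very angry at us"]

print(respects_increasing_volume(light))
print(respects_increasing_volume(heavy))
```

On this proxy, high attachment of 'yesterday' violates the ordering, while high attachment of the heavier 'because' clause does not, which matches the pattern described above.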

Stowe (1989, experiment 1) found that for a certain
class of ambiguous verbs, plausibility manipulations can
affect the garden path effect in (5)b. For verbs which exhibit
causative/ergative alternations, illustrated in (6),
reducing the plausibility of the subject to serve as the causal
agent (e.g. (7)b) picks out the ergative reading and avoids
the garden path.



(6) Causative: John moved the pencil.
Ergative: The pencil moved. 



(7) a. Before the police stopped the driver was already
getting nervous.
b. Before the truck stopped the driver was already
getting nervous.



Recent research on the local ambiguity in (3) has focused
on whether subcategorization preferences of the
matrix verb affect the garden-path. When the matrix verb
is one which tends to take a clausal complement more
often than an NP complement, the garden path may be
avoided. As with other psycholinguistic facts summarized
here, the claims are still controversial, but a recent
experiment by Garnsey, Lotocky and McConkie (1992)
seems to settle the issue in favor of lexical preference
effects.

In summary, there are meaning-based accounts for people's
ambiguity resolution preferences for (1)a, b, and
(5)a. Lexical preference accounts for some of the observations
in (3) and (5)b, but cannot alone account for the
garden paths in (8).



(8) a. John finally realized just how wrong he had
been remained to be seen.
b. When Mary returned some of the presents
were missing.



(8)a has the same structure as (3) and the verb 'realize' is
biased toward a sentential complement reading (according
to Garnsey et al.'s findings, as well as the Brown Corpus).
But there is still a perceptible garden-path effect.
(8)b has the same structure as (5)b and the verb 'return'
occurs more frequently in the Brown Corpus without an
object than with one. But again, there is still a garden
path effect.



4 Avoid Subjects 

All of the examples above that are not fully accounted
for, i.e. (5)c, d, (8)a, b, exhibit a common property:
a certain NP, which I will call the critical NP, has both
a subject and a non-subject analysis. In each of these
examples, the non-subject analysis is preferred.

There are a few hints in the literature that readers really 
do prefer to avoid subjects. One hint comes from a sec- 
ond experiment that Stowe (1989) conducted. In addition 
to manipulating the agenthood of the subject of the first 
clause, this experiment also manipulated the plausibility 
of the critical NP to serve as the direct object. Sample 
experimental materials are given in (9). 



(9) Animate:
Plausible: When the police stopped the driver
became very frightened.
Implausible: When the police stopped the silence
became very frightening.

Inanimate:
Plausible: When the truck stopped the driver
became very frightened.
Implausible: When the truck stopped the silence
became very frightening.

Stowe found implausibility effects at the critical NP even 
in the inanimate sentences, where the readers exhibit 
commitment to the ergative (intransitive) analysis. 

Another hint comes from an experiment by Holmes,
Kennedy and Murray (1987). Using experimental materials
as in (10), Holmes et al. found that in the disambiguation
region (either 'to the officer' or 'had been changed')
the transitive verb sentence (TR) was read substantially
faster than the other two sentences.

(10) (TR) The maid disclosed the safe's location 
within the house to the officer. 

(TC) The maid disclosed that the safe's location 
within the house had been changed. 

(RC) The maid disclosed the safe's location 
within the house had been changed. 

The that-complement (TC) sentence was read slightly 
faster than the reduced complement (RC) sentence. This 
finding was subsequently replicated by Kennedy et al. 
(1989). It is quite surprising that the unambiguous con- 
dition TC should take longer than the locally ambiguous 
condition TR. Holmes et al.'s speculation that beginning
a new clause requires additional processing is consistent 
with a strategy of avoiding analyzing NPs as subjects. 

My claim is not that the processor is averse to subjects 
in general, but rather that it prefers to avoid only a cer- 
tain class of subjects, namely those which are new to the 
discourse. Given that all of the examples above are pre- 
sented out of context, it is clear that all critical NPs are 
new to the hearer/reader's model of the discourse. 



5 Given and New 

Prince (1981) proposed a classification of occurrences of 
NPs in terms of assumed familiarity. When a speaker 
refers to an entity which s/he assumes salient/familiar to 
the hearer, s/he tends to use a brief form, such as a definite 
NP or a pronoun. Otherwise s/he is obliged to provide the 
hearer with enough information to construct this entity in 
the hearer's mind. Prince classified the forms of NPs and
ranked them from given to new:



evoked An expression used to refer to one of the conversation's
participants or an entity which is already
under discussion (usually a definite NP or pronoun).

unused A proper name which refers to an entity known
to the speaker and hearer, but not already in the
present discourse.

inferable A phrase which introduces an entity not already
in the discourse, but which is easily inferred
from another entity currently under discussion (cf.
bridging inference of Haviland and Clark 1974).

containing inferable An expression that introduces a
new entity and contains a reference to the extant discourse
entity from which the inference is to proceed
(e.g. 'one of the people that work at Penn').

brand new An expression that introduces a new entity
which cannot be inferentially related or predicted
from entities already in the discourse.

Prince constructed this scale on the basis of scale-based 
implicatures that can be drawn if a speaker uses a form 
which is either too high or too low — such a speaker 
would be sounding uncooperative/cryptic or needlessly 
verbose, respectively. 

Using this classification, Prince found that naturally
occurring texts exhibit a significant tendency to avoid 
placing new NPs (including inferable and unused) in sub- 
ject position. If we construe this tendency as a principle 
of the linguistic competence, we would indeed expect a 
reader to prefer to treat an out-of-context NP as some- 
thing other than a subject. I refer to this principle as Avoid 
New Subjects. 
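
Prince's scale and the preference that Avoid New Subjects derives from it can be sketched as an ordered enumeration. The numeric values, the function names, and the choice to treat every category other than evoked as "new" (the text above names inferable and unused explicitly) are assumptions of this illustration rather than claims about Prince's (1981) analysis.

```python
from enum import IntEnum


class Givenness(IntEnum):
    """Prince's categories, ordered from given (low) to new (high)."""
    EVOKED = 0
    UNUSED = 1
    INFERABLE = 2
    CONTAINING_INFERABLE = 3
    BRAND_NEW = 4


def is_new(status):
    """Everything past EVOKED counts as new in this sketch."""
    return status > Givenness.EVOKED


def preferred_analysis(status):
    """Avoid New Subjects: a new critical NP is preferably not a subject."""
    return "non-subject" if is_new(status) else "no preference"


for status in Givenness:
    print(status.name, preferred_analysis(status))
```

An out-of-context NP cannot be evoked, so under this sketch it always receives the non-subject preference.
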

6 Late Closure and Avoid New Subjects 

The principle of Avoid New Subjects predicts that in ordinary
text the Late Closure Effect exhibited in (5)b should
disappear when the critical NP is given in the discourse.
To test this prediction, I conducted a survey of the bracketed
Brown and Wall Street Journal corpora for the following
configuration: a VP which ends with a verb and
is immediately followed by an NP. Crucially, no punctuation
was allowed between the VP and the NP. I then
removed by hand all matches where there was no ambiguity,
e.g. the clause was in the passive or the verb could
not take the NP as argument for some reason. Of the
eleven remaining matches, four are given below. Each
is preceded by some context, and followed by an illustration
of the ambiguity (in brackets), and by a categorization of
the critical NP.

1. [An article about a movie describes how its composer
approached one of the singers.] When you
approach a singer and tell her you don't want her to
sing you always run the risk of offending.

['You don't want her to sing you a song.'] 

'you' = evoked. 

2. From the way she sang in those early sessions, it 
seemed clear that Michelle (Pfeiffer) had been lis- 
tening not to Ella but to Bob Dylan. "There was 
a pronunciation and approach that seemed Dylan- 
influenced," recalled Ms. Stevens. Vowels were 
swallowed, word endings were given short or no 
shrift. "When we worked it almost became a joke 
with us that I was constantly reminding her to say 
the consonants as well as the vowels." 

['When we worked it out. . .'] 

'it' = pleonastic. 

3. After the 1987 crash, and as a result of the rec- 
ommendations of many studies, "circuit breakers" 
were devised to allow market participants to regroup 
and restore orderly market conditions. It's doubtful, 
though, whether circuit breakers do any real good. 
In the additional time they provide even more order 
imbalances might pile up, as would-be sellers finally 
get their broker on the phone. 

[Even though this example in- 
volves a wh-dependency, the fact remains that the 
NP 'even more order imbalances' could be initially 
construed as a dative, as in 'In the additional time 
they provide even the slowest of traders, problems 
could. . .'] 

'even more order imbalances' = brand new 

4. [Story about the winning company in a competition
for teenage-run businesses, its president, Tim Larson,
and the organizing entity, Junior Achievement.]
For winning Larson will receive a $100 U.S. Savings 
Bond from the Junior Achievement national organi- 
zation. 

[. . .winning Larson over to their camp. . .] 
'Larson' = evoked 
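
The search configuration described at the start of this section (a VP ending in a verb, immediately followed by an NP, with no intervening punctuation) might be approximated as follows. This is a sketch, not the procedure actually used for the survey: the bracket parser, the function names, the test for verbs (a part-of-speech tag beginning with VB), and the hand-built tree for (8)b are all assumptions. The manual filtering of unambiguous matches is not modelled.

```python
def parse(text):
    """Parse one Penn-Treebank-style bracketed tree into (label, children)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])   # a word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read()


def collect(tree, words, nodes):
    """Append words to `words` and (label, first, last, last_tag) to `nodes`."""
    label, children = tree
    first = len(words)
    last_tag = label
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            last_tag = collect(child, words, nodes)
    nodes.append((label, first, len(words) - 1, last_tag))
    return last_tag


def vp_np_matches(text):
    """Return (verb, next word) pairs where a verb-final VP abuts an NP."""
    words, nodes = [], []
    collect(parse(text), words, nodes)
    vp_ends = {last for (label, _first, last, tag) in nodes
               if label.startswith("VP") and tag.startswith("VB")}
    np_starts = {first for (label, first, _last, _tag) in nodes
                 if label.startswith("NP")}
    return [(words[i], words[i + 1]) for i in sorted(vp_ends) if i + 1 in np_starts]


# Hand-built (hypothetical) bracketing of example (8)b.
plain = ("(S (SBAR (WHADVP (WRB When)) (S (NP (NNP Mary)) (VP (VBD returned)))) "
         "(NP (NP (DT some)) (PP (IN of) (NP (DT the) (NNS presents)))) "
         "(VP (VBD were) (VP (VBG missing))) (. .))")
# Same sentence with a comma after the fronted clause.
comma = plain.replace("(NP (NP (DT some))", "(, ,) (NP (NP (DT some))")

print(vp_np_matches(plain))
print(vp_np_matches(comma))
```

Because punctuation marks are leaves of the tree, requiring the NP to begin at the very next leaf after the VP excludes comma-delimited cases without a separate check.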

As can be expected of carefully written prose, none of
the matches posed any reading difficulty. Of the eleven
critical NPs, four were pleonastic, five were evoked, one
was inferable and one was brand new. Prince's givenness
scale does not include pleonastic NPs, since they do
not refer. For the present purpose, it suffices to note that
Avoid New Subjects does not rule out pleonastics. While
the numbers here are too small for statistical inference,[1]
the data suggest that the prediction of Avoid New Subjects
is maintained.

[1] Given the high frequency of given subjects, optionally transitive
verbs and fronted adverbials, one might expect more matches in a two
million word corpus. But examination of the Wall Street Journal corpus
reveals that most fronted adverbials are set off by comma, regardless
of potential ambiguity. Of 7256 sentence initial adverbials, only 8.14%
(591) are not delimited by comma. Of these 7256 adverbials, 1698 have
the category SBAR, of which only 4.18% (71) are not delimited by
comma. The great majority of fronted adverbials (4515) have category
PP, of which 8.75% (433) are not delimited by comma.



7 Complement Clauses

In order to be relevant for the ambiguity in (3), Avoid New
Subjects must be applicable not just to subjects of matrix
clauses but also to embedded subjects. It is widely believed
that constituents in a sentence tend to be ordered
from given to new. The statistical tendency to avoid new
subjects may be arising solely as a consequence of the
tendency to place new information toward the end of a
sentence and the grammatically-imposed early placement
of subjects. If this were the case, that is, if Avoid New
Subjects were a corollary of Given Before New, then Avoid New
Subjects would make no predictions about subjects of
complement clauses, given that complement clauses tend
to appear rather late in the sentence. I now argue from the
perspective of sentence production that it is the grammatical
function of subjects, not just their linear placement in
the sentence, that is involved with the avoidance of new
information.

When a speaker/writer wishes to express a proposition
which involves reference to an entity not already mentioned
in the discourse, s/he must use a new NP. S/he is
quite likely to avoid placing this NP in subject position.
To this end, s/he may use constructions such as passivization,
there-insertion, and clefts.

Avoid New Subjects predicts that this sort of effort on
the part of writers should be evident in both matrix clauses
and complement clauses. To test this prediction, I compared
the informational status of NPs in subject and non-subject
positions in both matrix and embedded clauses. I
defined subject position as an NP immediately dominated
by S and followed (not necessarily immediately, to allow
for auxiliaries, punctuation, etc.) by a VP. I defined non-subject
position as an NP either immediately dominated
by VP or immediately dominated by S and not followed
(not necessarily immediately) by VP. To determine givenness
status, I used a simple heuristic procedure[2] to classify
an NP into one of the following categories: EMPTY-CATEGORY,
PRONOUN, PROPER-NAME, DEFINITE, INDEFINITE,
NOT-CLASSIFIED. The observed frequencies
are given in table 1 at the end of this paper.

[2] I am grateful to Robert Frank for helpful suggestions regarding this
procedure.
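
The two structural definitions of subject and non-subject position can be written down directly over bracketed trees. The sketch below is an illustration under stated assumptions: trees are (label, children) pairs, function tags such as -SBJ are stripped before comparison, and the example tree is hand-built rather than taken from the Treebank. The givenness heuristic itself is not specified in the text and is not modelled here.

```python
def base(label):
    """Strip function tags such as -SBJ from a category label."""
    return label.split("-")[0]


def leaves(tree):
    """Return the words under a node, left to right."""
    _label, children = tree
    words = []
    for child in children:
        words.extend([child] if isinstance(child, str) else leaves(child))
    return words


def classify_nps(tree, found=None):
    """Collect (position, words) for every NP directly under S or VP."""
    if found is None:
        found = []
    label, children = tree
    phrases = [c for c in children if not isinstance(c, str)]
    for i, child in enumerate(phrases):
        if base(child[0]) == "NP":
            # "Followed (not necessarily immediately) by a VP" among the sisters.
            vp_follows = any(base(p[0]) == "VP" for p in phrases[i + 1:])
            if base(label) == "S" and vp_follows:
                found.append(("subject", leaves(child)))
            elif base(label) in ("S", "VP"):
                found.append(("non-subject", leaves(child)))
        classify_nps(child, found)
    return found


# Hypothetical tree for 'Mary said the joke offended him'.
tree = ("S", [
    ("NP-SBJ", [("NNP", ["Mary"])]),
    ("VP", [
        ("VBD", ["said"]),
        ("SBAR", [("S", [
            ("NP-SBJ", [("DT", ["the"]), ("NN", ["joke"])]),
            ("VP", [("VBD", ["offended"]), ("NP", [("PRP", ["him"])])]),
        ])]),
    ]),
])

for position, words in classify_nps(tree):
    print(position, " ".join(words))
```

NPs dominated by other categories, such as objects of prepositions, fall under neither definition and are simply skipped.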



PRONOUNS are either pleonastic or evoked; they are
thus fairly reliable indicators of given (at least non-new)
NPs. The category INDEFINITE contains largely brand-new
or inferable NPs, thus being a good indicator of new
information. Considering PRONOUNS and INDEFINITES,
there is a clear effect of grammatical function for both
matrix and embedded clauses.[3]

             matrix clause         embedded clause
             subj    non-subj      subj    non-subj
PRONOUN      7580    956           1800    213
INDEFINITE   4157    5269          736     899
             χ² = 3952.2           χ² = 839.5
             p < 0.001             p < 0.001

[3] For reasons of space I only give results from the Brown corpus, but
all assertions I make also hold of the Wall Street Journal corpus.

The prediction of Avoid New Subjects is therefore verified.

When a hearer/reader is faced with an initial segment
such as (11), the ambiguity is not exactly between an NP
complement analysis versus an S-complement analysis,
but rather between a TR (transitive verb) analysis and
an RC (reduced S-complement) analysis.

(11) John has heard the joke...

It is therefore necessary to verify that Avoid New Subjects
is indeed operating in this RC sub-class of sentential
complements. A further analysis reveals that this is
indeed the case.

             TC                    RC
             subj    non-subj      subj    non-subj
PRONOUN      773     79            1027    134
INDEFINITE   617     555           119     344
             χ² = 332.6            χ² = 627.6
             p < 0.001             p < 0.001

If anything, Avoid New Subjects has a stronger effect after
a zero complementizer.



8 Unambiguous Structures

The findings of Holmes et al. (1987), that effects predicted
by Avoid New Subjects are present even in unambiguous
structures, transfer to other unambiguous structures,
such as the classical center embedding sentence:

(12) The rat that the cat that the dog bit chased died.

In this sentence three new subjects must be accommodated
simultaneously. Of all the examples that I have seen
of sentences with a structure as in (12), the easiest one to
understand is (13) from Frank (1992).

(13) A book that some Italian I've never heard of
wrote will be published soon by MIT press.

Notice that this example requires the simultaneous processing
of only two new subjects, the most embedded
subject being evoked. When this subject is replaced with
a new NP, the sentence becomes harder to process.

(14) A book that some Italian most people have never
heard of wrote will be published soon by MIT press.

Also, when a complex, center-embedded NP appears in
subject position (15)a, it is harder to process than when
it appears in object position (15)b. (15) is based on an
example from Gibson (1991) which is in turn based on
earlier work by Cowper.

(15) a. Many bureaucrats who the information that
Iraq invaded Kuwait affected negatively work
for the government.
b. The government employs many bureaucrats
who the information that Iraq invaded Kuwait
affected negatively.



I don't wish to claim that the new-subject effect is 
solely responsible for all of the difficulty associated with 
center embedding, but it is clear that it is playing an im- 
portant role. 
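
As a check on the statistics reported in Section 7, the χ² values for the 2×2 tables there can be recomputed from the published counts. The sketch below uses the standard Pearson formula for a 2×2 table without continuity correction; that choice is an assumption, since the text does not say which variant was used, but the results agree with the reported values to within rounding. The function and variable names are illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: PRONOUN, INDEFINITE. Columns: subject, non-subject.
tables = {
    "matrix": (7580, 956, 4157, 5269),
    "embedded": (1800, 213, 736, 899),
    "TC": (773, 79, 617, 555),
    "RC": (1027, 134, 119, 344),
}

for name, counts in tables.items():
    print(f"{name}: chi-square = {chi_square_2x2(*counts):.1f}")
```

With one degree of freedom, any value above 10.83 corresponds to p < 0.001, so all four tables clear that threshold by a wide margin.
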

9 Conclusion 

I have argued for an account of sentence processing
wherein the syntactic processor applies the rules of the
competence grammar blindly and faithfully. Ambiguity
resolution decisions are made by the interpreter when it
considers the analyses which the syntactic processor has
proposed and evaluates them on the basis of sensibleness.
The criteria I have appealed to (Grice's maxims, ordering
constituents by the amount of information they convey,
and not putting new information in subject position) are
all components of our knowledge of language and not the
exclusive domain of the process of parsing, nor that of
production.

Avoid New Subjects predicts that for sentences such as
(3), the garden path effect should disappear when the sentence
is put in a context in which the NP is given, and its
form is felicitous. The design and execution of an experiment
to test this prediction remain for future research.